Abstract



Clique-width is a graph invariant that has been widely studied in combinatorics and computer
science. However, computing the clique-width of a graph is an intricate problem; the exact
clique-width is not known even for very small graphs. We present a new method for computing the
clique-width of graphs based on an encoding to propositional satisfiability (SAT), which is then
evaluated by a SAT solver. Our encoding is based on a reformulation of clique-width in terms of
partitions that utilizes an efficient encoding of cardinality constraints. Our SAT-based method is
the first to discover the exact clique-width of various small graphs, including famous graphs from
the literature as well as random graphs of various density. With our method we determined the
smallest graphs that require a small prescribed clique-width.

1 Introduction

Clique-width is a fundamental graph invariant that has been widely studied in combinatorics and
computer science. Clique-width measures in a certain sense the "complexity" of a graph. It is
defined via a graph construction process involving four operations where only a limited number of
vertex labels are available; vertices that share the same label at a certain point of the
construction process must be treated uniformly in subsequent steps. This graph composition
mechanism was first considered by Courcelle, Engelfriet, and Rozenberg [10, 11] and has since then
been an important topic in combinatorics and computer science.

Graphs of small clique-width have advantageous algorithmic properties. Algorithmic meta-theorems
show that large classes of NP-hard optimization problems and #P-hard counting problems can be solved
in linear time on classes of graphs of bounded clique-width [7, 8]. Similar results hold for the
graph invariant treewidth; however, clique-width is more general in the sense that graphs of small
treewidth also have small clique-width, but there are graphs of small clique-width but arbitrarily
high treewidth [11]. Unlike treewidth, dense graphs (e.g., cliques) can also have small clique-width.

All these algorithms for graphs of small clique-width require that a certificate for the graph having
small clique-width is provided. However, it seems that computing the certificate, or just deciding
whether the clique-width of a graph is bounded by a given number, is a very intricate combinatorial
problem.

More precisely, given a graph G and an integer k, deciding whether the clique-width of G is at most k
is NP-complete [IB]. Even worse, the clique-width of a graph with n vertices of degree greater than 2
cannot be approximated by a polynomial-time algorithm with an absolute error guarantee of
$n^{\varepsilon}$ unless P = NP, where $0 \le \varepsilon < 1$ [IB]. In fact, it is even unknown
whether graphs of clique-width at most 4 can be recognized in polynomial time [S]. There are
approximation algorithms with an exponential error that, for fixed k, compute f(k)-expressions for
graphs of clique-width at most k in polynomial time (where $f(k) = 2^{3k+2} - 1$ by [30], and
$f(k) = 8^{k} - 1$ by [29]).

Because of the intricacy of this graph invariant, the exact clique-width is not known even for very
small graphs.

Clique-width via SAT. We present a new method for determining the clique-width based on a
sophisticated SAT encoding which entails the following ideas:

1. Reformulation. The conventional construction method for determining the clique-width of a graph
consists of many steps. In the worst case, the number of steps is quadratic in the number of
vertices. Translating this construction method into SAT would result in large instances, even for
small graphs. We reformulated the problem in such a way that the number of steps is less than the
number of vertices. The alternative construction method allows us to compute the clique-width
of much larger graphs.

2. Representative encoding. Applying the frequently used direct encoding [35] to the reformulation
results in instances that do not realize arc consistency [18], i.e., unit propagation may find
conflicts much later than required. We developed the representative encoding, which is compact and
realizes arc consistency.

Experimental Results The implementation of our method allows us for the first time to determine 
the exact clique-width of various graphs, including famous graphs known from the literature, as well as 
random graphs of various density. 

1. Clique-width of Small Random Graphs. We determined experimentally how the clique-width of
random graphs depends on the density. The clique-width is small for dense and sparse graphs and
reaches its maximum for edge-probability 0.5. The larger n, the steeper the increase towards 0.5.
These results complement the asymptotic results of Lee et al. [37].

2. Smallest Graphs of Certain Clique-width. In general it is not known how many vertices are
required to form a graph of a certain clique-width. We provide these numbers for clique-width
$k \in \{1, \dots, 7\}$. In fact, we could compute the total number of connected graphs (modulo
isomorphism) with a certain clique-width with up to 10 vertices. For instance, there are only 7
connected graphs with 8 vertices and clique-width 5 (modulo isomorphism), and no graphs with 9
vertices and clique-width 6. There are 68 graphs with 10 vertices and clique-width 6. The smallest
one has 18 edges.

3. Clique-width of Famous Named Graphs. Over the last 50 years, researchers in graph theory have
considered a large number of special graphs. These special graphs have been used as counterexamples
for conjectures or for showing the tightness of combinatorial results. We considered several
prominent graphs from the literature and computed their exact clique-width. These results may
be of interest for people working in combinatorics and graph theory.

Related Work. We are not aware of any implemented algorithms that compute the clique-width
exactly or heuristically. However, algorithms have been implemented that compute upper bounds on
other width-based graph invariants, including treewidth [HI, 19, 26], branchwidth [33],
Boolean-width [24], and rank-width [2]. Samer and Veith [3T] proposed a SAT encoding for the exact
computation of treewidth. Boolean-width and rank-width can be used to approximate clique-width;
however, the error can be exponential in the clique-width. In contrast, treewidth and branchwidth
can be arbitrarily far from the clique-width, hence the approximation error is unbounded [3].

Our SAT encoding is based on a new characterization of clique-width that uses partitions
instead of labels. A similar partition-based characterization of clique-width has been proposed by
Heggernes et al. [23]. There are two main differences to our reformulation. Firstly, our
characterization of clique-width uses three individual properties that can be easily expressed by
clauses. Secondly, our characterization admits the "parallel" processing of several parts of the
graph that are later joined together.



2 Preliminaries 

2.1 Formulas and Satisfiability 

We consider propositional formulas in Conjunctive Normal Form (CNF formulas, for short), which are
conjunctions of clauses, where a clause is a disjunction of literals, and a literal is a propositional
variable or a negated propositional variable. A CNF formula is satisfiable if its variables can be
assigned true or false, such that each clause contains either a variable set to true or a negated
variable set to false. The satisfiability problem (SAT) asks whether a given formula is satisfiable.
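
As a minimal illustration of this definition, the following sketch decides satisfiability of a small CNF formula by trying all assignments. The integer encoding of literals (a negative number denotes a negated variable) and the name `is_satisfiable` are assumptions made for this example; a real SAT solver works very differently.

```python
from itertools import product

def is_satisfiable(clauses):
    """Brute-force SAT check; a literal is a nonzero int, negative means negated."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        # A clause is satisfied if it contains a variable set to true
        # or a negated variable set to false.
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
example_sat = [[1, -2], [2, 3], [-1, -3]]
# (x1) and (not x1)
example_unsat = [[1], [-1]]
```

The running time is exponential in the number of variables, so this is only meant to make the definition concrete.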

2.2 Graphs and Clique-width

All graphs considered are finite, undirected, and without self-loops. We denote a graph G by an
ordered pair (V(G), E(G)) of its set of vertices and its set of edges, respectively. An edge between
vertices u and v is denoted uv or equivalently vu. For basic terminology on graphs we refer to a
standard text book [ig].

Let k be a positive integer. A k-graph is a graph whose vertices are labeled by integers from
$\{1, \dots, k\}$. We consider an arbitrary graph as a k-graph with all vertices labeled by 1. We
call the k-graph consisting of exactly one vertex v (say, labeled by i) an initial k-graph and
denote it by i(v). The clique-width of a graph G is the smallest integer k such that G can be
constructed from initial k-graphs by means of repeated application of the following three operations.

1. Disjoint union (denoted by $\oplus$);

2. Relabeling: changing all labels i to j (denoted by $\rho_{i \to j}$);

3. Edge insertion: connecting all vertices labeled by i with all vertices labeled by j, $i \ne j$
(denoted by $\eta_{i,j}$ or $\eta_{j,i}$); already existing edges are not doubled.

A construction of a k-graph using the above operations can be represented by an algebraic term
composed of $\oplus$, $\rho_{i \to j}$, and $\eta_{i,j}$ ($i, j \in \{1, \dots, k\}$, and $i \ne j$).
Such a term is called a k-expression defining G. Thus, the clique-width of a graph G is the smallest
integer k such that G can be defined by a k-expression.

Example 1. The graph $P_4 = (\{a, b, c, d\}, \{ab, bc, cd\})$ is defined by the 3-expression

$$\eta_{2,3}(\rho_{2 \to 1}(\eta_{2,3}(\eta_{1,2}(1(a) \oplus 2(b)) \oplus 3(c))) \oplus 2(d)).$$

Hence $\mathrm{cwd}(P_4) \le 3$. In fact, one can show that $P_4$ has no 2-expression, and thus
$\mathrm{cwd}(P_4) = 3$. ⊣
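
The three operations can be made concrete with a small evaluator. The sketch below is illustrative only: the nested-tuple representation of expressions and the name `evaluate` are assumptions for this example. It builds the labeled graph defined by a k-expression and is applied to a 3-expression for $P_4$ of the form $\eta_{2,3}(\rho_{2 \to 1}(\eta_{2,3}(\eta_{1,2}(1(a) \oplus 2(b)) \oplus 3(c))) \oplus 2(d))$.

```python
def evaluate(expr):
    """Return (labels, edges) of the labeled graph defined by a k-expression.

    Expressions are nested tuples (a representation assumed for this sketch):
      ("init", i, v), ("union", e1, e2), ("relabel", i, j, e), ("eta", i, j, e)
    """
    op = expr[0]
    if op == "init":                      # i(v): a single vertex v with label i
        _, label, vertex = expr
        return {vertex: label}, set()
    if op == "union":                     # disjoint union of two labeled graphs
        labels1, edges1 = evaluate(expr[1])
        labels2, edges2 = evaluate(expr[2])
        assert not set(labels1) & set(labels2), "operands must be vertex-disjoint"
        return {**labels1, **labels2}, edges1 | edges2
    if op == "relabel":                   # rho_{i->j}: change all labels i to j
        _, i, j, sub = expr
        labels, edges = evaluate(sub)
        return {v: (j if lab == i else lab) for v, lab in labels.items()}, edges
    if op == "eta":                       # eta_{i,j}: join label i with label j
        _, i, j, sub = expr
        labels, edges = evaluate(sub)
        new_edges = {frozenset((u, v)) for u in labels for v in labels
                     if labels[u] == i and labels[v] == j}
        return labels, edges | new_edges  # set union: existing edges are not doubled
    raise ValueError(f"unknown operation: {op}")

inner = ("eta", 1, 2, ("union", ("init", 1, "a"), ("init", 2, "b")))
middle = ("eta", 2, 3, ("union", inner, ("init", 3, "c")))
p4_expr = ("eta", 2, 3, ("union", ("relabel", 2, 1, middle), ("init", 2, "d")))

p4_labels, p4_edges = evaluate(p4_expr)
```

Relabeling b from 2 to 1 frees label 2 for d, so the final edge insertion joins d only with c.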

2.3 Partitions 

As partitions play an important role in our reformulation of clique-width, we recall some basic
terminology. A partition of a set S is a set P of nonempty subsets of S such that any two sets in P
are disjoint and S is the union of all sets in P. The elements of P are called equivalence classes.
Let P, P' be partitions of S. Then P' is a refinement of P if any two elements $x, y \in S$ that are
in the same equivalence class of P' are also in the same equivalence class of P (this entails the
case P = P').
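
A minimal sketch of these two notions, with partitions represented as lists of Python sets (the function names are chosen for this example only):

```python
def is_partition(blocks, universe):
    """True if `blocks` are nonempty, pairwise disjoint, and cover `universe`."""
    blocks = [set(b) for b in blocks]
    universe = set(universe)
    total = sum(len(b) for b in blocks)
    # Equal total size plus equal union implies pairwise disjointness.
    return all(blocks) and total == len(universe) and set().union(*blocks) == universe

def is_refinement(fine, coarse):
    """True if every class of `fine` lies inside some class of `coarse`.

    For two partitions of the same set this is equivalent to the definition above.
    """
    return all(any(set(f) <= set(c) for c in coarse) for f in fine)

S = {1, 2, 3, 4}
P = [{1, 2}, {3, 4}]
P_fine = [{1}, {2}, {3, 4}]
```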

3 A Reformulation of Clique-width without Labels 

Initially, we developed a SAT encoding of clique-width based on k-expressions. Even after several
optimization steps, this encoding was only able to determine the clique-width of graphs consisting of
at most 8 vertices. We therefore developed a new encoding based on a reformulation of clique-width
which does not use k-expressions. In this section we explain this reformulation; in the next section
we will discuss how it can be encoded into SAT efficiently.

Consider a finite set V, the universe. A template T consists of two partitions cmp(T) and grp(T)
of V. We call the equivalence classes in cmp(T) the components of T and the equivalence classes in
grp(T) the groups of T. For some intuition about these concepts, imagine that components represent
induced subgraphs and that groups represent sets of vertices in some component with the same label
in a k-expression. A derivation of length t is a finite sequence $\mathcal{D} = (T_0, \dots, T_t)$
satisfying the following conditions.

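D1 $|\mathrm{cmp}(T_0)| = |V|$ and $|\mathrm{cmp}(T_t)| = 1$.

D2 $\mathrm{grp}(T_i)$ is a refinement of $\mathrm{cmp}(T_i)$, $0 \le i \le t$.

D3 $\mathrm{cmp}(T_{i-1})$ is a refinement of $\mathrm{cmp}(T_i)$, $1 \le i \le t$.

D4 $\mathrm{grp}(T_{i-1})$ is a refinement of $\mathrm{grp}(T_i)$, $1 \le i \le t$.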

We would like to note that D1 and D2 together imply that $|\mathrm{grp}(T_0)| = |V|$. Thus, in the
first template $T_0$ all equivalence classes (groups and components) are singletons, and when we
progress through the derivation, some of these sets are merged, until all components are merged into
a single component in the last template $T_t$.

The width of a component $C \in \mathrm{cmp}(T)$ is the number of groups $g \in \mathrm{grp}(T)$
such that $g \subseteq C$. The width of a template is the maximum width over its components, and the
width of a derivation is the maximum width over its templates. A k-derivation is a derivation of
width at most k. A derivation $\mathcal{D} = (T_0, \dots, T_t)$ is a derivation of a graph
G = (V, E) if V is the universe of the derivation and the following three conditions hold for all
$1 \le i \le t$.

Edge Property: For any two vertices $u, v \in V$ such that $uv \in E$, if u, v are in the same group
in $T_i$, then u, v are in the same component in $T_{i-1}$.

Neighborhood Property: For any three vertices $u, v, w \in V$ such that $uv \in E$ and
$uw \notin E$, if v, w are in the same group in $T_i$, then u, v are in the same component in
$T_{i-1}$.

Path Property: For any four vertices $u, v, w, x \in V$ such that $uv, uw, vx \in E$ and
$wx \notin E$, if u, x are in the same group in $T_i$ and v, w are in the same group in $T_i$, then
u, v are in the same component in $T_{i-1}$.

The neighborhood property and the path property could be merged into a single property if we do not 
insist that all mentioned vertices are distinct. However, two separate properties provide a more compact 
SAT encoding. 

The following example illustrates that a derivation can define more than one graph, in contrast to a
k-expression, which defines exactly one graph.

Example 2. Consider the derivation $\mathcal{D} = (T_0, \dots, T_3)$ with universe
$V = \{a, b, c, d\}$ and

$\mathrm{cmp}(T_0) = \{\{a\},\{b\},\{c\},\{d\}\}$, $\mathrm{grp}(T_0) = \{\{a\},\{b\},\{c\},\{d\}\}$,

$\mathrm{cmp}(T_1) = \{\{a,b\},\{c\},\{d\}\}$, $\mathrm{grp}(T_1) = \{\{a\},\{b\},\{c\},\{d\}\}$,

$\mathrm{cmp}(T_2) = \{\{a,b,c\},\{d\}\}$, $\mathrm{grp}(T_2) = \{\{a\},\{b\},\{c\},\{d\}\}$,

$\mathrm{cmp}(T_3) = \{\{a,b,c,d\}\}$, $\mathrm{grp}(T_3) = \{\{a,b\},\{c\},\{d\}\}$.

The width of $\mathcal{D}$ is 3. Consider the graph $G = (V, \{ab, ad, bc, bd\})$. To see that
$\mathcal{D}$ is a 3-derivation of G, we need to check the edge, neighborhood, and path properties.
We observe that a, b are the only two vertices such that $ab \in E(G)$ and both vertices appear in
the same group of some $T_i$ (here, we have i = 3). To check the edge property, we only need to
verify that a, b are in the same component of $T_2$, which is true. For the neighborhood property,
the only relevant choice of three vertices is a, b, c ($bc \in E(G)$, $ac \notin E(G)$, and a, b in
a group of $T_3$). The neighborhood property requires that b, c are in the same component in $T_2$,
which is the case. The path property is satisfied since there is no template in which two pairs of
vertices belong to the same group, respectively.

Similarly we can verify that $\mathcal{D}$ is a derivation of the graph
$G' = (V, \{ab, bc, cd\})$. In fact, for all connected graphs with four vertices, there exists an
isomorphic graph that is defined by $\mathcal{D}$ (see Figure 1). However, $\mathcal{D}$ is not a
derivation of the graph $G'' = (V, \{ab, ac, bd, cd\})$ since the neighborhood property is violated:
$bd \in E(G'')$ and $ad \notin E(G'')$, a, b belong to the same group in $T_3$, while b, d do not
belong to the same component in $T_2$. ⊣

Figure 1: All connected graphs with four vertices (up to isomorphism). The 3-derivation of Example 2
defines all six graphs. The clique-width of all but the first graph is 2.
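
The conditions D1-D4 and the three properties can be checked mechanically. The following sketch is an illustration, not the authors' implementation: a derivation is assumed to be a list of templates `(components, groups)`, each a list of sets, the function names are hypothetical, and the inputs are assumed to be partitions of the vertex set already. It reproduces the claims of the example for the three edge sets {ab, ad, bc, bd}, {ab, bc, cd}, and {ab, ac, bd, cd}.

```python
from itertools import permutations

def same_block(partition, u, v):
    return any(u in block and v in block for block in partition)

def is_refinement(fine, coarse):
    return all(any(set(f) <= set(c) for c in coarse) for f in fine)

def width(derivation):
    """Maximum number of groups contained in a single component."""
    return max(sum(1 for g in grp if set(g) <= set(c))
               for cmp_, grp in derivation for c in cmp_)

def is_derivation_of(derivation, vertices, edges):
    """Check D1-D4 and the edge, neighborhood, and path properties."""
    E = {frozenset(e) for e in edges}
    adj = lambda u, v: frozenset((u, v)) in E
    t = len(derivation) - 1
    if len(derivation[0][0]) != len(vertices) or len(derivation[t][0]) != 1:
        return False                                                   # D1
    for i, (cmp_i, grp_i) in enumerate(derivation):
        if not is_refinement(grp_i, cmp_i):                            # D2
            return False
        if i == 0:
            continue
        cmp_prev, grp_prev = derivation[i - 1]
        if not (is_refinement(cmp_prev, cmp_i)                         # D3
                and is_refinement(grp_prev, grp_i)):                   # D4
            return False
        together = lambda u, v: same_block(cmp_prev, u, v)   # component in T_{i-1}
        grouped = lambda u, v: same_block(grp_i, u, v)       # group in T_i
        for u, v in permutations(vertices, 2):               # edge property
            if adj(u, v) and grouped(u, v) and not together(u, v):
                return False
        for u, v, w in permutations(vertices, 3):            # neighborhood property
            if (adj(u, v) and not adj(u, w) and grouped(v, w)
                    and not together(u, v)):
                return False
        for u, v, w, x in permutations(vertices, 4):         # path property
            if (adj(u, v) and adj(u, w) and adj(v, x) and not adj(w, x)
                    and grouped(u, x) and grouped(v, w) and not together(u, v)):
                return False
    return True

V = "abcd"
SINGLETONS = [{"a"}, {"b"}, {"c"}, {"d"}]
D = [
    (SINGLETONS, SINGLETONS),
    ([{"a", "b"}, {"c"}, {"d"}], SINGLETONS),
    ([{"a", "b", "c"}, {"d"}], SINGLETONS),
    ([{"a", "b", "c", "d"}], [{"a", "b"}, {"c"}, {"d"}]),
]
G = ["ab", "ad", "bc", "bd"]
G_PATH = ["ab", "bc", "cd"]
G_CYCLE = ["ab", "ac", "bd", "cd"]
```

The checker enumerates all ordered tuples of vertices, which is fine for four vertices but far too slow for larger graphs.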

We call a derivation $(T_0, \dots, T_t)$ strict if $|\mathrm{cmp}(T_{i-1})| > |\mathrm{cmp}(T_i)|$
holds for all $1 \le i \le t$.

Lemma 1. If G has a k-derivation, it has a strict k-derivation. 

Proof. Let $\mathcal{D} = (T_0, \dots, T_t)$ be a k-derivation of G. Assume there is some
$1 \le i \le t$ such that $\mathrm{cmp}(T_{i-1}) = \mathrm{cmp}(T_i)$. If also
$\mathrm{grp}(T_{i-1}) = \mathrm{grp}(T_i)$, then $T_{i-1} = T_i$, and we can safely remove
$T_{i-1}$ and still have a k-derivation of G. Hence assume
$\mathrm{grp}(T_{i-1}) \ne \mathrm{grp}(T_i)$. This implies that $i > 1$ (for $i = 1$ all components
of $T_1$ would be singletons, and by D2 so would all its groups). If $i = t$, then we can safely
remove $T_t$ from the derivation and $(T_0, \dots, T_{t-1})$ is clearly a k-derivation of G. Hence
it remains to consider the case $1 < i \le t - 1$. We show that by dropping $T_i$ we get a sequence
$\mathcal{D}' = (T_0, \dots, T_{i-1}, T_{i+1}, \dots, T_t)$ that is a k-derivation of G.

The new sequence $\mathcal{D}'$ is clearly a derivation of width at most k. It remains to verify
that $\mathcal{D}'$ is a derivation of G, that is, that the edge, neighborhood, and path properties
still hold. The template $T_{i+1}$ is the only one where these properties might have been violated
by the removal of $T_i$. However, since all three properties impose a restriction on the set of
components of the template preceding $T_{i+1}$, and since
$\mathrm{cmp}(T_{i-1}) = \mathrm{cmp}(T_i)$, the properties are not affected by the deletion of
$T_i$. Hence $\mathcal{D}'$ is indeed a k-derivation of G.

By repeated application of the above shortening we can turn any k-derivation into a strict
k-derivation. □



Lemma 2. Every strict k-derivation of a graph with n vertices has length at most n − 1.

Proof. Let $(T_0, \dots, T_t)$ be a strict k-derivation of a graph with n vertices. Since
$|\mathrm{cmp}(T_0)| = n$ and $|\mathrm{cmp}(T_t)| = 1$, it follows that $t \le n - 1$. □

In the proofs of the next two lemmas we need the following concept of a k-expression tree, which is
the parse tree of a k-expression equipped with some additional information. Let $\varphi$ be a
k-expression for a graph G = (V, E). Let Q be the parse tree of $\varphi$ with root r. Consider a
node x of Q and let $\varphi_x$ be the subexpression of $\varphi$ whose parse tree is the subtree of
Q rooted at x. Then x is labeled with the k-graph $G_x$ constructed by the k-expression
$\varphi_x$. Thus the leaves of Q are labeled with initial k-graphs and the root r is labeled with a
labeled version of G. One $\oplus$-node of the parse tree can represent several directly subsequent
$\oplus$-operations (e.g., the operation $(x \oplus y) \oplus z$ can be represented by a single node
with three children). Evidently, k-expressions and their k-expression trees can be effectively
transformed into each other.

We introduce some additional terminology on k-expression trees. We call a non-leaf node of Q an
$\oplus$-node, $\eta$-node, or $\rho$-node, according to the operation it represents. We define the
rank R(x) of a node x of Q as the largest number of $\oplus$-nodes that appear on a path from a leaf
of Q to x. Hence leaves have rank 0. We denote the set of nodes of Q of rank i by $V_i(Q)$.

Lemma 3. From a k-expression of a graph G we can obtain a k-derivation of G in polynomial time. 

Proof. Let $\varphi$ be a k-expression of G = (V, E) and let Q be the corresponding k-expression
tree. We let $t := R(r)$ and define a derivation $\mathcal{D} = (T_0, \dots, T_t)$ by setting
$\mathrm{cmp}(T_i) = \{ V(G_x) : x \in V_i(Q) \}$ and
$\mathrm{grp}(T_i) = \bigcup_{x \in V_i(Q)} \mathrm{grp}(G_x)$, where $\mathrm{grp}(G_x)$ denotes
the partition of $V(G_x)$ into sets of vertices that have the same label. By construction,
$\mathcal{D}$ is a derivation with universe V. Furthermore, since $\varphi$ is a k-expression,
$|\mathrm{grp}(G_x)| \le k$ for all nodes x of Q. Hence $\mathcal{D}$ is a k-derivation. It remains
to show that $\mathcal{D}$ is a k-derivation of G. Let $1 \le i \le t$.

To show that the edge property holds, consider two vertices $u, v \in V$ such that $uv \in E$ and
u, v are in the same group in $T_i$. Assume to the contrary that u, v belong to different components
$c_1, c_2$ in $T_{i-1}$. Let x be the $\oplus$-node of rank i with
$u, v \in V(G_x) \in \mathrm{cmp}(T_i)$, and let $x_1, x_2$ be the children of x with
$V(G_{x_1}) = c_1$ and $V(G_{x_2}) = c_2$. Hence $uv \notin E(G_{x_1}) \cup E(G_{x_2})$. However,
since u, v are in the same group in $T_i$, this means that u, v have the same label in $G_x$. Thus
the edge uv cannot be introduced by an $\eta$-operation, and so $uv \notin E(G_r) = E$, a
contradiction. Hence the edge property holds.

To show that the neighborhood property holds, consider three vertices $u, v, w \in V$ such that
$uv \in E$, $uw \notin E$, and v, w are in the same group of $T_i$. Assume to the contrary that u, v
are in different components of $T_{i-1}$, say in components $c_1$ and $c_2$, respectively. Since
v, w are in the same group of $T_i$, they are also in the same component c of $T_i$. Let x be the
$\oplus$-node of rank i with $v, w \in V(G_x) = c \in \mathrm{cmp}(T_i)$, and let $x_1, x_2$ be the
children of x. Clearly $uv \notin E(G_{x_1}) \cup E(G_{x_2})$, hence there must be an $\eta$-node y
somewhere on the path between x and r where the edge uv is introduced. However, since v and w share
the same label in $G_x$, they share the same label in $G_y$. Consequently, the $\eta$-operation that
introduces the edge uv also introduces the edge uw. However, this contradicts the assumption that
$uw \notin E$. Hence the neighborhood property holds as well.

To show that the path property holds, we proceed similarly. Consider four vertices
$u, v, w, x \in V$ such that $uv, uw, vx \in E$ and $wx \notin E$. Assume that u, x are in the same
group in $T_i$ and v, w are in the same group in $T_i$. Assume to the contrary that u, v are in
different components of $T_{i-1}$, say in components $c_1$ and $c_2$, respectively. Above we have
shown that the neighborhood property holds. Hence we conclude that u, w belong to the same component
of $T_{i-1}$, and v, x belong to the same component of $T_{i-1}$. Since u, x are in the same group
in $T_i$, they are also in the same component of $T_i$, say in component c. Thus all four vertices
u, v, w, x belong to c. Let z be the $\oplus$-node of rank i with
$V(G_z) = c \in \mathrm{cmp}(T_i)$, and let $z_1, z_2$ be the children of z. Clearly
$uv \notin E(G_{z_1}) \cup E(G_{z_2})$, hence there must be an $\eta$-node y somewhere on the path
between z and r where the edge uv is introduced. However, since v and w share the same label in
$G_z$, and u and x share the same label in $G_z$, this also holds in $G_y$. Hence the
$\eta$-operation that introduces the edge uv also introduces the edge xw. However, this contradicts
the assumption that $xw \notin E$. Hence the path property holds as well. We conclude that
$\mathcal{D}$ is indeed a k-derivation of G.

The above procedure for generating the k-derivation can clearly be carried out in polynomial time.
□

Example 3. Consider the 3-expression $\varphi$ for the graph $P_4$ of Example 1. Applying the
procedure described in the proof of Lemma 3 we obtain the 3-derivation $\mathcal{D}$ of Example 2. ⊣

Lemma 4. From a k-derivation of a graph G we can obtain a k-expression of G in polynomial time. 

Proof. Let $\mathcal{D} = (T_0, \dots, T_t)$ be a k-derivation of G = (V, E). Using the construction
of the proof of Lemma 1 we can obtain a strict k-derivation of G from any given k-derivation of G.
Hence we may assume, w.l.o.g., that $\mathcal{D}$ is strict. Let
$C = \bigcup_{i=0}^{t} \mathrm{cmp}(T_i)$.

First we construct a k-expression $\varphi_\oplus$ that only contains $\oplus$-operations and
initial graphs. We introduce each vertex $v \in V$ with label 1. Next, for each component c of $T_1$
that is the union of two or more components $c_1, \dots, c_s$ of $T_0$, we introduce s − 1
$\oplus$-operations that represent this union. We proceed in this manner for $T_2, \dots, T_t$, and
end up with the k-expression $\varphi_\oplus$. Let $Q_\oplus$ denote the corresponding k-expression
tree.

We observe that for every component $c \in C$ there is a unique $\oplus$-node $x_c$ of $Q_\oplus$
such that $V(G_{x_c}) = c$.

In the next step we add to $Q_\oplus$ certain $\rho$-nodes to obtain the k-expression tree
$Q_{\oplus,\rho}$. Consider a component $c \in C$. Let $T_i$ be the template with smallest i such
that $c \in \mathrm{cmp}(T_i)$ and consider the $\oplus$-node $x_c$ labeled by $G_{x_c}$. Let
$x_{c_1}, \dots, x_{c_s}$ be the children of $x_c$. We add at most k many $\rho$-nodes between $x_c$
and $x_{c_j}$, for $1 \le j \le s$, such that $\mathrm{grp}(G_{x_c}) \subseteq \mathrm{grp}(T_i)$.
This is possible, since by the definition of a derivation, $c \in \mathrm{cmp}(T_i)$ can be written
as $c = c_1 \cup \dots \cup c_s$ for $c_1, \dots, c_s \in \mathrm{cmp}(T_{i-1})$, and the partition
of c induced by $\mathrm{grp}(T_{i-1})$ is a refinement of the partition of c induced by
$\mathrm{grp}(T_i)$. We do the same for all components $c \in C$, and obtain this way a k-expression
tree $Q_{\oplus,\rho}$ and a corresponding k-expression $\varphi_{\oplus,\rho}$.

As a final step, we add $\eta$-nodes to $Q_{\oplus,\rho}$ in order to obtain the k-expression tree
Q. Let $uv \in E$ be an edge of G. We show that there is an $\oplus$-node in $Q_{\oplus,\rho}$ above
which we can add an $\eta$-node that introduces edges including uv but does not introduce any edge
not present in E.

Let i be the smallest index such that u and v belong to the same component of $T_i$. Let this
component be c, and consider the corresponding $\oplus$-node $x_c$ labeled with the k-graph
$G_{x_c}$. Because of the edge property, u and v must belong to different groups of $T_i$, and
consequently, they must have different labels in $G_{x_c}$, say labels a and b, respectively. We add
an $\eta$-node above $x_c$ representing the operation $\eta_{a,b}$. This inserts the edge uv into
$G_{x_c}$. We need to show that $\eta_{a,b}$ does not add any edge that is not in E. We show that
for all pairs of vertices $u', v' \in c$ where u' has label a and v' has label b in $G_{x_c}$, the
edge u'v' is in E.

We consider four cases.

Case 1: $u = u'$, $v = v'$. Trivially, $u'v' = uv \in E$.

Case 2: $u = u'$, $v \ne v'$. Assume to the contrary that $u'v' = uv' \notin E$. Since v and v' have
the same label in $G_{x_c}$, they belong to the same group of $T_i$. The neighborhood property
implies that u and v belong to the same component of $T_{i-1}$, a contradiction to the minimal
choice of i. Hence $u'v' \in E$.

Case 3: $u \ne u'$, $v = v'$. This case is symmetric to Case 2.

Case 4: $u \ne u'$, $v \ne v'$. It follows from Cases 2 and 3 that $uv', vu' \in E$. Assume to the
contrary that $u'v' \notin E$. Then the path property implies that u and v belong to the same
component of $T_{i-1}$, a contradiction to the minimal choice of i. Hence $u'v' \in E$.

Consequently, we can successively add $\eta$-nodes to $Q_{\oplus,\rho}$ until all edges of E are
inserted, but no edge outside of E. Let Q be the obtained k-expression tree and let $\varphi$ be the
corresponding k-expression. By the above arguments, $\varphi$ is a k-expression for G.

The above procedure for generating the fc-expression can clearly be carried out in polynomial time. 

□ 

We note that we could have saved some $\rho$-operations in the proof of Lemma 4. In particular, the
k-expression produced may contain $\rho$-operations where the number of different labels before and
after the application of the $\rho$-operation remains the same. It is easy to see that such
$\rho$-operations can be omitted if we change labels of some initial k-graphs accordingly.

Example 4. Consider the derivation $\mathcal{D}$ of graph G in Example 2. We construct a
3-expression of G using the procedure described in the proof of Lemma 4. First we obtain
$\varphi_\oplus = ((1(a) \oplus 1(b)) \oplus 1(c)) \oplus 1(d)$. Next we insert $\rho$-operations to
represent how the groups evolve through the derivation:
$\varphi_{\oplus,\rho} = \rho_{1 \to 2}((1(a) \oplus \rho_{1 \to 2}(1(b))) \oplus \rho_{1 \to 3}(1(c))) \oplus 1(d)$.
Finally we add $\eta$-operations, and obtain
$\varphi_{\oplus,\rho,\eta} = \eta_{1,2}(\rho_{1 \to 2}(\eta_{2,3}(\eta_{1,2}(1(a) \oplus \rho_{1 \to 2}(1(b))) \oplus \rho_{1 \to 3}(1(c)))) \oplus 1(d))$. ⊣

By Lemma 2 we do not need to search for k-derivations of length > n − 1 when the graph under
consideration has n vertices. The next lemma improves this bound to n − k + 1, which provides a
significant improvement for our SAT encoding, especially if the graph under consideration has large
clique-width.

Lemma 5. Let $1 \le k \le n$. If a graph with n vertices has a k-derivation, then it has a
k-derivation of length n − k + 1.

Proof. Let $k \ge 1$ be fixed. We define the k-length of a derivation $(T_0, \dots, T_t)$ as the
integer t − i where i is the largest index $0 \le i \le t$ where all components of $T_i$ have size
at most k. Let $\ell(n, k)$ be the smallest integer such that the k-length of any strict derivation
over a universe of size n is at most $\ell(n, k)$. Before we show the lemma, we establish three
claims. For these claims, the groups of the considered derivations are irrelevant and hence will be
ignored.

Claim 1: If $n \ge k$, then $\ell(n, k) < \ell(n + 1, k)$.

To show the claim, consider a strict derivation $\mathcal{D} = (T_0, \dots, T_t)$ over a universe V
of size n with k-length $\ell$. We take a new element a and form a strict derivation $\mathcal{D}'$
over the universe $V \cup \{a\}$ by adding the singleton $\{a\}$ to $\mathrm{cmp}(T_i)$ for
$0 \le i \le t$ and adding a new template $T_{t+1}$ with
$\mathrm{cmp}(T_{t+1}) = \{V \cup \{a\}\}$. The new derivation $\mathcal{D}'$ has k-length
$\ell + 1$.


Claim 2: Let $\mathcal{D} = (T_0, \dots, T_t)$ be a strict derivation over a universe V of size n of
k-length $\ell(n, k)$. Then $T_{t - \ell(n,k) + 1}$ has exactly one component of size k + 1 and all
other components are singletons.

We proceed to show the claim. Let $j = t - \ell(n, k)$, and observe that j is the largest index
where all components of $T_j$ have size at most k. Thus $T_{j+1}$ has components
$c_1, \dots, c_r$, $r \ge 1$, of size greater than k. We show that r = 1. Assume to the contrary
that r > 1. We pick some element $a_i \in c_i$, $2 \le i \le r$, and set
$X = \bigcup_{i=2}^{r} (c_i \setminus \{a_i\})$. The derivation $\mathcal{D}$ induces a strict
derivation $\mathcal{D}'$ over the universe $V' = V \setminus X$. Observe that
$n' = |V'| < |V| = n$. Evidently $\mathcal{D}'$ has the same k-length as $\mathcal{D}$, hence
$\ell(n', k) \ge \ell(n, k)$, a contradiction to Claim 1. Hence r = 1, and $c_1$ is the only
component in $T_{j+1}$ of size greater than k. We show that $|c_1| = k + 1$. We assume to the
contrary that $|c_1| > k + 1$. We pick k + 1 elements $b_1, \dots, b_{k+1} \in c_1$ and set
$X = c_1 \setminus \{b_1, \dots, b_{k+1}\}$. Similarly as above, we observe that $\mathcal{D}$
induces a strict derivation $\mathcal{D}''$ over the universe $V'' = V \setminus X$, and that
$\mathcal{D}''$ has the same k-length as $\mathcal{D}$. Since $|V''| < |V|$ we have again a
contradiction to Claim 1. Hence Claim 2 is established.

Claim 3: $\ell(n, k) \le n - k$.

To see the claim, let $\mathcal{D} = (T_0, \dots, T_t)$ be a strict derivation over a universe V of
size n of k-length $\ell(n, k)$. Let $j = t - \ell(n, k)$. By Claim 2 we know that $T_{j+1}$ has
exactly one component of size k + 1 and all other components are singletons (hence there are
n − k − 1 singletons). We conclude that $|\mathrm{cmp}(T_{j+1})| = n - k$. Since $\mathcal{D}$ is
strict, we have
$n - k = |\mathrm{cmp}(T_{j+1})| > |\mathrm{cmp}(T_{j+2})| > \dots > |\mathrm{cmp}(T_t)| = 1$.
Thus $\ell(n, k) = t - j \le n - k$, and the claim follows.

We are now in the position to establish the statement of the lemma. Let
$\mathcal{D} = (T_0, \dots, T_t)$ be a k-derivation of a graph G = (V, E) with |V| = n. By Lemma 1
we may assume that $\mathcal{D}$ is strict. Let $\ell$ be the k-length of $\mathcal{D}$ and let
$j = t - \ell$. By Claim 3 we know that $\ell \le n - k$. We define a new template $T'_j$ with
$\mathrm{cmp}(T'_j) = \mathrm{cmp}(T_j)$ and $\mathrm{grp}(T'_j) = \mathrm{grp}(T_0)$, and we set
$\mathcal{D}' = (T_0, T'_j, T_{j+1}, \dots, T_t)$. We claim that $\mathcal{D}'$ is a k-derivation of
G. Clearly $\mathcal{D}'$ is a derivation, but we need to check the edge, neighborhood, and path
property for $T'_j$ and $T_{j+1}$ in $\mathcal{D}'$. The properties hold trivially for $T'_j$ since
all its groups are singletons. For $T_{j+1}$ the properties hold since $T'_j$ has the same
components as $T_j$. Thus $\mathcal{D}'$ is indeed a k-derivation of G. The length of
$\mathcal{D}'$ is $\ell + 1 \le n - k + 1$, hence the lemma follows. □

Example 5. Again, consider the derivation $\mathcal{D}$ of Example 2. $\mathcal{D}$ defines $P_4$,
which has clique-width 3 [S]. According to Lemma 5 it should have a derivation of length
n − k + 1 = 4 − 3 + 1 = 2. We can obtain such a derivation by removing $T_1$ from $\mathcal{D}$,
which gives $\mathcal{D}' = (T_0, T_2, T_3)$. ⊣
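
The shortening step from the proof of Lemma 5 can be sketched as follows. This is an illustration under the assumption that a derivation is a list of templates `(components, groups)`, each a list of sets, and that the input is already strict; the name `shorten` is hypothetical.

```python
def shorten(derivation, k):
    """Shorten a strict k-derivation following the proof of Lemma 5."""
    # j: largest index whose components all have size at most k.
    # Index 0 always qualifies because all components of T_0 are singletons.
    j = max(i for i, (cmp_i, _) in enumerate(derivation)
            if all(len(c) <= k for c in cmp_i))
    if j == 0:
        return derivation                  # nothing to skip
    cmp_j, _ = derivation[j]
    grp_0 = derivation[0][1]
    # T'_j keeps the components of T_j but uses the singleton groups of T_0.
    return [derivation[0], (cmp_j, grp_0)] + derivation[j + 1:]

SINGLETONS = [{"a"}, {"b"}, {"c"}, {"d"}]
D = [
    (SINGLETONS, SINGLETONS),
    ([{"a", "b"}, {"c"}, {"d"}], SINGLETONS),
    ([{"a", "b", "c"}, {"d"}], SINGLETONS),
    ([{"a", "b", "c", "d"}], [{"a", "b"}, {"c"}, {"d"}]),
]
D_short = shorten(D, 3)
```

For the derivation of Example 2 and k = 3, the index j is 2, so the result consists of T_0, a template with the components of T_2, and T_3, which has length 2.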

By combining Lemmas 3, 4, and 5 we arrive at the main result of this section.

Proposition 1. Let $1 \le k \le n$. A graph G with n nodes has clique-width at most k if and only if
G has a k-derivation of length at most n − k + 1.

4 Encoding a Derivation of a Graph 

Let G = (V, E) be a graph, and t > 0 an integer. We are going to construct a CNF formula
$F_{\mathrm{der}}(G, t)$ that is satisfiable if and only if G has a derivation of length t. We
assume that the vertices of G are given in some arbitrary but fixed linear order <.

For any two distinct vertices u and v of G and any $0 \le i \le t$ we introduce a component variable
$c_{u,v,i}$. Similarly, for any two distinct vertices u and v of G with u < v and any
$0 \le i \le t$ we introduce a group variable $g_{u,v,i}$. Intuitively, $c_{u,v,i}$ or $g_{u,v,i}$
are true if and only if u and v are in the same component or group, respectively, in the ith
template of an implicitly represented derivation of G.

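The formula $F_{\mathrm{der}}(G, t)$ is the conjunction of all the clauses described below.
The following clauses represent the conditions D1-D4 (clauses that mention i − 1 are added only for
$i \ge 1$).

$$(\neg c_{u,v,0}) \wedge (c_{u,v,t}) \wedge (c_{u,v,i} \vee \neg g_{u,v,i}) \wedge (\neg c_{u,v,i-1} \vee c_{u,v,i}) \wedge (\neg g_{u,v,i-1} \vee g_{u,v,i})$$

for $u, v \in V$, $u < v$, $0 \le i \le t$.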

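We further add clauses that ensure that the relations of being in the same group and of being in the
same component are transitive.

$$(\neg c_{u,v,i} \vee \neg c_{v,w,i} \vee c_{u,w,i}) \wedge (\neg c_{u,v,i} \vee \neg c_{u,w,i} \vee c_{v,w,i}) \wedge (\neg c_{u,w,i} \vee \neg c_{v,w,i} \vee c_{u,v,i})$$

$$(\neg g_{u,v,i} \vee \neg g_{v,w,i} \vee g_{u,w,i}) \wedge (\neg g_{u,v,i} \vee \neg g_{u,w,i} \vee g_{v,w,i}) \wedge (\neg g_{u,w,i} \vee \neg g_{v,w,i} \vee g_{u,v,i})$$

for $u, v, w \in V$, $u < v < w$, $0 \le i \le t$.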

In order to enforce the edge property we add the following clauses for any two vertices
$u, v \in V$ with u < v, $uv \in E$ and $1 \le i \le t$:

$$(c_{u,v,i-1} \vee \neg g_{u,v,i})$$

Further, to enforce the neighborhood property, we add for any three vertices $u, v, w \in V$ with
$uv \in E$ and $uw \notin E$ and $1 \le i \le t$, the following clauses.

$$(c_{\min(u,v),\max(u,v),i-1} \vee \neg g_{\min(v,w),\max(v,w),i})$$

Finally, to enforce the path property we add for any four vertices u, v, w, x, such that
$uv, uw, vx \in E$ and $wx \notin E$, u < v and $1 \le i \le t$ the following clauses:

$$(c_{u,v,i-1} \vee \neg g_{\min(u,x),\max(u,x),i} \vee \neg g_{\min(v,w),\max(v,w),i})$$

The following statement is a direct consequence of the above definitions. 
Lemma 6. $F_{\mathrm{der}}(G, t)$ is satisfiable if and only if G has a derivation of length t.
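
To make the construction concrete, the following sketch generates clauses for conditions D1-D4, transitivity, and the edge, neighborhood, and path properties, and evaluates them under the assignment induced by a given derivation. The clause shapes are an assumption derived directly from the conditions as stated (each implication "same group implies same component" becomes one clause), and the names `f_der`, `assignment_from`, and `satisfies` are hypothetical; this is not the authors' implementation.

```python
from itertools import combinations, permutations

def pair(u, v):
    return (u, v) if u < v else (v, u)

def f_der(vertices, edges, t):
    """Clauses for a derivation of length t; a literal is (kind, (u, v), i, polarity)."""
    E = {pair(*e) for e in edges}
    adj = lambda u, v: pair(u, v) in E
    c = lambda u, v, i, pol=True: ("c", pair(u, v), i, pol)   # component variable
    g = lambda u, v, i, pol=True: ("g", pair(u, v), i, pol)   # group variable
    clauses = []
    for u, v in combinations(sorted(vertices), 2):
        clauses += [[c(u, v, 0, False)], [c(u, v, t)]]                 # D1
        for i in range(t + 1):
            clauses.append([c(u, v, i), g(u, v, i, False)])            # D2
            if i >= 1:
                clauses.append([c(u, v, i - 1, False), c(u, v, i)])    # D3
                clauses.append([g(u, v, i - 1, False), g(u, v, i)])    # D4
    for u, v, w in combinations(sorted(vertices), 3):                  # transitivity
        for i in range(t + 1):
            for var in (c, g):
                clauses += [
                    [var(u, v, i, False), var(v, w, i, False), var(u, w, i)],
                    [var(u, v, i, False), var(u, w, i, False), var(v, w, i)],
                    [var(u, w, i, False), var(v, w, i, False), var(u, v, i)],
                ]
    for i in range(1, t + 1):
        for u, v in permutations(vertices, 2):
            if u < v and adj(u, v):                                    # edge
                clauses.append([c(u, v, i - 1), g(u, v, i, False)])
        for u, v, w in permutations(vertices, 3):
            if adj(u, v) and not adj(u, w):                            # neighborhood
                clauses.append([c(u, v, i - 1), g(v, w, i, False)])
        for u, v, w, x in permutations(vertices, 4):
            if u < v and adj(u, v) and adj(u, w) and adj(v, x) and not adj(w, x):
                clauses.append([c(u, v, i - 1),                        # path
                                g(u, x, i, False), g(v, w, i, False)])
    return clauses

def assignment_from(derivation):
    """Truth values of c and g variables induced by a list of (cmp, grp) templates."""
    def value(kind, uv, i):
        partition = derivation[i][0 if kind == "c" else 1]
        return any(set(uv) <= set(block) for block in partition)
    return value

def satisfies(value, clauses):
    return all(any(value(kind, uv, i) == pol for kind, uv, i, pol in clause)
               for clause in clauses)

SINGLETONS = [{"a"}, {"b"}, {"c"}, {"d"}]
D = [
    (SINGLETONS, SINGLETONS),
    ([{"a", "b"}, {"c"}, {"d"}], SINGLETONS),
    ([{"a", "b", "c"}, {"d"}], SINGLETONS),
    ([{"a", "b", "c", "d"}], [{"a", "b"}, {"c"}, {"d"}]),
]
ok = satisfies(assignment_from(D), f_der("abcd", ["ab", "ad", "bc", "bd"], 3))
bad = satisfies(assignment_from(D), f_der("abcd", ["ab", "ac", "bd", "cd"], 3))
```

With the derivation of Example 2, the induced assignment satisfies the clauses for the edge set {ab, ad, bc, bd} but falsifies a neighborhood clause for {ab, ac, bd, cd}, in line with that example.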

5 Encoding a k-Derivation of a Graph

In this section, we describe how the formula $F_{\mathrm{der}}(G, t)$ can be extended to encode a
derivation of width at most k. Ideally, one wants to encode that unit propagation results in a
conflict on any assignment of component and group variables representing a derivation containing a
component with more than k groups. First we will describe the conventional direct encoding [35],
followed by our representative encoding. Only the latter encoding realizes arc consistency [18].

5.1 Direct Encoding 

We introduce new Boolean variables $l_{v,a,i}$ for $v \in V$, $1 \le a \le k$, and $0 \le i \le t$. The purpose is to
assign each vertex for each template a group number between 1 and $k$. The intended meaning of a
variable $l_{v,a,i}$ is that in $T_i$, vertex $v$ has group number $a$. Let $F(G,k,t)$ denote the formula obtained
from $F_{\mathrm{der}}(G,t)$ by adding the following three sets of clauses. The first ensures that every vertex has at
least one group number, the second ensures that every vertex has at most one group number, and the
third ensures that two vertices of the same group share the same group number.

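\[ (l_{v,1,i} \lor l_{v,2,i} \lor \dots \lor l_{v,k,i}) \quad \text{for } v \in V,\ 0 \le i \le t, \]

\[ (\lnot l_{v,a,i} \lor \lnot l_{v,b,i}) \quad \text{for } v \in V,\ 1 \le a < b \le k,\ 0 \le i \le t, \]

for $u, v \in V$, $u < v$, $1 \le a \le k$, $0 \le i \le t$.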
Together with Lemma 6, this construction directly yields the following statement.

Proposition 2. Let $G = (V,E)$ be a graph and $t = |V| - k + 1$. Then $F(G,k,t)$ is satisfiable if and only
if $\mathrm{cwd}(G) \le k$.

Example 6. Let $G = (V,E)$ and $k = 2$. Vertices $u, v, w \in V$ in template $T_i$ are in one component, but in
different groups. Hence the corresponding component variables are true, and the corresponding group
variables are false. The clauses containing the variables $l_{u,a,i}$, $l_{v,a,i}$, $l_{w,a,i}$ with $a \in \{1, 2\}$ after removing
falsified literals are:

\[
\begin{aligned}
&(l_{u,1,i} \lor l_{u,2,i}) \land (l_{v,1,i} \lor l_{v,2,i}) \land (l_{w,1,i} \lor l_{w,2,i}) \land (\lnot l_{u,1,i} \lor \lnot l_{v,1,i}) \land (\lnot l_{u,1,i} \lor \lnot l_{w,1,i}) \;\land \\
&(\lnot l_{v,1,i} \lor \lnot l_{w,1,i}) \land (\lnot l_{u,2,i} \lor \lnot l_{v,2,i}) \land (\lnot l_{u,2,i} \lor \lnot l_{w,2,i}) \land (\lnot l_{v,2,i} \lor \lnot l_{w,2,i})
\end{aligned}
\]

These clauses cannot be satisfied, yet unit propagation will not result in a conflict. Therefore, a SAT
solver may not be able to cut off the current branch.
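The behavior described in Example 6 can be replayed on a toy scale. The Python sketch below builds the nine clauses of the example, confirms by brute force that they are unsatisfiable, and runs a naive unit propagation routine that derives nothing. The data representation and the function names are assumptions for illustration. A real SAT solver implements propagation far more efficiently.

```python
from itertools import product

VERTS, NUMBERS = ["u", "v", "w"], [1, 2]

# A variable is (vertex, group number); a literal is (sign, variable).
clauses = [[(True, (x, a)) for a in NUMBERS] for x in VERTS]   # at least one number
clauses += [[(False, (x, a)), (False, (y, a))]                  # distinct groups differ
            for a in NUMBERS
            for x, y in [("u", "v"), ("u", "w"), ("v", "w")]]

def unit_propagate(clauses):
    """Return (conflict, assignment) after exhaustive unit propagation."""
    assignment = {}
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(var) == sign for sign, var in clause):
                continue                      # clause is already satisfied
            free = [(s, v) for s, v in clause if v not in assignment]
            if not free:
                return True, assignment       # all literals are false
            if len(free) == 1:                # unit clause: force its literal
                sign, var = free[0]
                assignment[var] = sign
                changed = True
    return False, assignment

def satisfiable(clauses):
    """Brute-force satisfiability check, only suitable for tiny formulas."""
    variables = sorted({var for clause in clauses for _, var in clause})
    for values in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, values))
        if all(any(a[v] == s for s, v in cl) for cl in clauses):
            return True
    return False
```

Since every clause has two unassigned literals, propagation has no unit clause to start from, although three vertices cannot receive pairwise different numbers from a set of two.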



5.2 The Representative Encoding 

To overcome the unit propagation problem of the direct encoding, as described in Example 6, we propose
the representative encoding, which uses two types of variables. First, for each $v \in V$ and $1 \le i \le t$ we
introduce a representative variable $r_{v,i}$. This variable, if assigned to true, expresses that vertex $v$ is the
representative of a group in template $T_i$. In each group, only one vertex can be the representative and
we choose to make the first vertex in the lexicographical ordering the representative. This results in the
following clauses:

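\[ \Bigl(r_{v,i} \lor \bigvee_{u \in V,\, u < v} g_{u,v,i}\Bigr) \land \bigwedge_{u \in V,\, u < v} (\lnot r_{v,i} \lor \lnot g_{u,v,i}) \quad \text{for } v \in V,\ 0 \le i \le t. \]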
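To see that these clauses pin down the representatives once the groups are fixed, consider the following Python sketch. It generates the clauses for one template and enumerates, by brute force, all assignments to the representative variables that are consistent with a given grouping. The variable naming and both function names are assumptions made for this illustration.

```python
from itertools import product

def representative_clauses(vertices, i):
    """Clauses defining r_{v,i}: v is a representative iff no smaller vertex shares its group."""
    clauses = []
    order = sorted(vertices)
    for idx, v in enumerate(order):
        smaller = order[:idx]
        # v is a representative unless it shares a group with a smaller vertex.
        clauses.append([(True, ("r", v, i))] +
                       [(True, ("g", u, v, i)) for u in smaller])
        # A vertex that shares a group with a smaller vertex is no representative.
        for u in smaller:
            clauses.append([(False, ("r", v, i)), (False, ("g", u, v, i))])
    return clauses

def forced_representatives(groups, i=0):
    """All sets of representatives consistent with a fixed grouping."""
    vertices = sorted(v for grp in groups for v in grp)
    same = lambda u, v: any(u in grp and v in grp for grp in groups)
    fixed = {("g", u, v, i): same(u, v)
             for u in vertices for v in vertices if u < v}
    clauses = representative_clauses(vertices, i)
    solutions = []
    for values in product([False, True], repeat=len(vertices)):
        a = dict(fixed)
        a.update({("r", v, i): val for v, val in zip(vertices, values)})
        if all(any(a[var] == sign for sign, var in cl) for cl in clauses):
            solutions.append({v for v in vertices if a[("r", v, i)]})
    return solutions
```

For the grouping `[{1, 3}, {2, 4, 5}]` the only consistent choice is that vertices 1 and 2, the smallest members of their groups, are the representatives.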

Additionally we introduce auxiliary variables to efficiently encode that the number of representative
vertices in a component is at most $k$. These auxiliary variables are based on the order encoding [34].
Consider a (non-Boolean) variable $L_{v,i}$ with domain $D = \{1, \dots, k\}$, whose elements denote the group
number of vertex $v$ in template $T_i$. In the direct encoding, we used $k$ variables $l_{v,a,i}$ with $a \in D$.
Assigning $l_{v,a,i} = 1$ in that encoding means $L_{v,i} = a$. Alternatively, we can use order variables $o^{>}_{v,a,i}$
with $v \in V$, $a \in D \setminus \{k\}$, $0 \le i \le t$. Assigning $o^{>}_{v,a,i} = 1$ means $L_{v,i} > a$. Consequently, $o^{>}_{v,a,i} = 0$ means
$L_{v,i} \le a$.

Example 7. Given an assignment to the order variables $o^{>}_{v,a,i}$, one can easily construct the equivalent
assignment to the variables in the direct encoding (and the other way around). Below is a visualization
of the equivalence relation with $k = 5$. In the middle is a binary representation of each of the $k$ labels
by concatenating the Boolean values to the order variables.
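The correspondence between the two encodings can be sketched in a few lines of Python. The function names are assumptions made for this illustration. The bit lists follow the definitions in the text: position $a$ of the direct encoding is 1 iff the group number equals $a$, and position $a$ of the order encoding is 1 iff the group number exceeds $a$.

```python
def direct_bits(label, k):
    """Values of l_1, ..., l_k for a group number in {1, ..., k}."""
    return [1 if label == a else 0 for a in range(1, k + 1)]

def order_bits(label, k):
    """Values of o_1, ..., o_{k-1}: o_a is 1 iff label > a."""
    return [1 if label > a else 0 for a in range(1, k)]

def label_from_order(bits):
    """Recover the group number from order bits (a run of 1s followed by 0s)."""
    return 1 + sum(bits)

def label_from_direct(bits):
    """Recover the group number from direct bits (exactly one 1)."""
    return 1 + bits.index(1)
```

With $k = 5$, group number 3 corresponds to the direct bits `[0, 0, 1, 0, 0]` and the order bits `[1, 1, 0, 0]`.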

Although our encoding is based on the variables from the order encoding, we use none of the associated
clauses. We implemented the original order encoding [34], which resulted in many long clauses, and its
performance was comparable to the direct encoding.

Instead, we combined the representative and order variables. Our use of the order variables can be
seen as the encoding of a sequential counter [32]. We would like to point out that if $u$ and $v$ are both
representative vertices in the same component of template $T_i$ and $u < v$, then $o^{>}_{u,a,i} = 0$ and $o^{>}_{v,a,i} = 1$
must hold for some $1 \le a < k$. Consequently, $o^{>}_{u,k-1,i} = 0$ (vertex $u$ does not have the highest group number in
$T_i$), $o^{>}_{v,1,i} = 1$ (vertex $v$ does not have the lowest group number in $T_i$), and $o^{>}_{u,a,i} \rightarrow o^{>}_{v,a+1,i}$. These constraints
can be expressed by the following clauses.

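\[
\begin{aligned}
&(\lnot c_{u,v,i} \lor \lnot r_{u,i} \lor \lnot r_{v,i} \lor \lnot o^{>}_{u,k-1,i}) \land (\lnot c_{u,v,i} \lor \lnot r_{u,i} \lor \lnot r_{v,i} \lor o^{>}_{v,1,i}) \;\land \\
&\bigwedge_{1 \le a < k-1} (\lnot c_{u,v,i} \lor \lnot r_{u,i} \lor \lnot r_{v,i} \lor \lnot o^{>}_{u,a,i} \lor o^{>}_{v,a+1,i}) \quad \text{for } u, v \in V,\ u < v,\ 0 \le i \le t.
\end{aligned}
\]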

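Example 8. Consider a graph $G = (V,E)$ with $u, v, w, x \in V$ and the representative encoding with $k = 3$.
We will show that if $u$, $v$, $w$, and $x$ are all in the same component and they are all representatives of their
respective group numbers in template $T_i$, then unit propagation will result in a conflict (because there
are four representatives and only three group numbers). Observe that all corresponding component
and representative variables are true. This example, with falsified literals removed, contains the clauses

\[
\begin{aligned}
&(\lnot o^{>}_{u,2,i}),\ (\lnot o^{>}_{u,1,i} \lor o^{>}_{v,2,i}),\ (o^{>}_{v,1,i}), \\
&(\lnot o^{>}_{u,2,i}),\ (\lnot o^{>}_{u,1,i} \lor o^{>}_{w,2,i}),\ (o^{>}_{w,1,i}), \\
&(\lnot o^{>}_{u,2,i}),\ (\lnot o^{>}_{u,1,i} \lor o^{>}_{x,2,i}),\ (o^{>}_{x,1,i}), \\
&(\lnot o^{>}_{v,2,i}),\ (\lnot o^{>}_{v,1,i} \lor o^{>}_{w,2,i}),\ (o^{>}_{w,1,i}), \\
&(\lnot o^{>}_{v,2,i}),\ (\lnot o^{>}_{v,1,i} \lor o^{>}_{x,2,i}),\ (o^{>}_{x,1,i}), \\
&(\lnot o^{>}_{w,2,i}),\ (\lnot o^{>}_{w,1,i} \lor o^{>}_{x,2,i}),\ (o^{>}_{x,1,i}).
\end{aligned}
\]

The unit clauses falsify the literals $o^{>}_{v,2,i}$, $o^{>}_{w,2,i}$, $\lnot o^{>}_{v,1,i}$, and $\lnot o^{>}_{w,1,i}$. Notice that
$(\lnot o^{>}_{v,1,i} \lor o^{>}_{w,2,i})$ is falsified, i.e., a conflicting clause.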
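The conflict in Example 8 can be reproduced with a small Python sketch. It generates the order-variable clauses for a list of representatives that share a component, with the component and representative literals already removed, and applies a naive unit propagation routine. All names are assumptions made for this illustration; the routine is a textbook-style propagation loop, not the procedure of any particular solver.

```python
from itertools import combinations

def counter_clauses(reps, k):
    """Order-variable clauses for representatives sharing a component.

    `reps` lists the representatives in vertex order; a variable (v, a)
    stands for the order variable of vertex v and threshold a.
    """
    clauses = []
    for u, v in combinations(reps, 2):         # u precedes v in the order
        clauses.append([(False, (u, k - 1))])  # u lacks the highest number
        clauses.append([(True, (v, 1))])       # v lacks the lowest number
        for a in range(1, k - 1):              # number of u is below that of v
            clauses.append([(False, (u, a)), (True, (v, a + 1))])
    return clauses

def unit_propagation_conflict(clauses):
    """Return True iff exhaustive unit propagation falsifies some clause."""
    assignment = {}
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(var) == sign for sign, var in clause):
                continue                       # clause is already satisfied
            free = [(s, v) for s, v in clause if v not in assignment]
            if not free:
                return True                    # all literals are false
            if len(free) == 1:                 # unit clause: force its literal
                sign, var = free[0]
                assignment[var] = sign
                changed = True
    return False
```

With `k = 3`, four representatives lead to a conflict by propagation alone, whereas three representatives do not.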



Both the direct and the representative encoding require $n(n + k - 1)(n - k + 2)$ variables. The number
of clauses depends on the set of edges. In the worst case, the number of clauses can be $O(n^5 - n^4 k)$ due to
the path condition.
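The variable count can be checked with a short calculation. The Python sketch below assumes that $n = |V|$, that $t = n - k + 1$ (so there are $n - k + 2$ templates), and that each unordered vertex pair contributes one component and one group variable per template. The function name is hypothetical.

```python
def num_variables(n, k):
    """Number of variables of F(G, k, t) for t = n - k + 1 (assumed counting)."""
    templates = n - k + 2                  # templates T_0, ..., T_t
    pair_vars = n * (n - 1)                # component + group variable per unordered pair
    number_vars = n * k                    # k group-number (or 1 + (k - 1) repr./order) variables per vertex
    return (pair_vars + number_vars) * templates
```

Under these assumptions the result equals $n(n + k - 1)(n - k + 2)$. For the Petersen graph ($n = 10$) and $k = 4$ it gives 1,040, which agrees with the entry in Table 3.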

6 Experimental Results

In this section we report the results we obtained by running our SAT encoding on various classes of
graphs. Given a graph $G = (V,E)$, we establish that $G$ has clique-width $k$ by determining for which
value of $k$ it holds that $F(G,k,|V|-k+1)$ is satisfiable and $F(G,k-1,|V|-k+2)$ is unsatisfiable.
We used the SAT solver Glucose version 2.2 ^j to solve the encoded problems. Glucose solved the 
hardest instances about twice as fast (or more) as other state-of-the-art solvers such as Lingeling [5, 
Minisat [H] and Clasp [T^. We used a 4-core Intel Xeon CPU E31280 3.50GHz, 32 Gb RAM machine 
running Ubuntu 10.04. 
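The search for the exact clique-width described at the start of this section can be sketched as a simple loop. In the Python sketch below, `is_satisfiable(k, t)` is a hypothetical callback that would build $F(G,k,t)$ for the graph at hand and hand it to a SAT solver. Neither the callback nor the function name comes from the paper, and the upward linear search is just one possible strategy.

```python
from typing import Callable

def clique_width(n: int, is_satisfiable: Callable[[int, int], bool]) -> int:
    """Smallest k for which F(G, k, n - k + 1) is satisfiable.

    n is the number of vertices; is_satisfiable(k, t) is a hypothetical
    oracle wrapping formula generation and the SAT solver call.
    """
    for k in range(1, n + 1):
        t = n - k + 1                  # derivation length used for width k
        if is_satisfiable(k, t):
            return k                   # formula for k - 1 was unsatisfiable
    raise ValueError("no k <= n yields a satisfiable formula")
```

Because unsatisfiable instances tend to be the expensive ones, a search that starts from a known upper bound and decreases $k$ would be an equally valid design.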

Although the direct and representative encodings result in CNF formulas of almost equal size, there
is a huge difference in the cost of solving these instances. Determining the clique-width of the famous named
graphs (see below) using the direct encoding takes about two to three orders of magnitude longer
than using the representative encoding. For example, we can establish that the Paley graph with
13 vertices has clique-width 9 within a few seconds using the representative encoding, while the solver
requires over an hour using the direct encoding. Because of the huge difference in speed, we discarded
the use of the direct encoding in the remainder of this section.

We noticed that upper bounds (satisfiable formulas) are obtained much faster than lower bounds
(unsatisfiable formulas). The reason is twofold. First, the whole search space needs to be explored for
lower bounds, while for upper bounds, one can be "lucky" and find a solution fast. Second, due to
our encoding, upper bound formulas are smaller (due to a smaller $t$), which makes them easier. Table 1
shows this for a random graph with 20 vertices for the direct encoding and the representative encoding.

Table 1: Runtimes in seconds of the direct and representative encoding on a random graph with 20
vertices and 95 edges for different values of $k$. Up to $k = 9$ the formulas are unsatisfiable, afterwards
they are satisfiable. Timeout (TO) is 20,000 seconds.

  k     direct    repres
  3       1.39      0.62
  4      14.25      2.12
  5      101.1      8.14
  6      638.5     12.14
  7     18,337     33.94
  8         TO     102.3
  9         TO     358.6
 ------------------------
 10         TO      9.21
 11         TO      0.40
 12      30.57      0.35
 13       0.67      0.32
 14       0.50      0.29
 15       0.10      0.29
 16       0.10      0.28



We examined whether adding symmetry-breaking predicates could improve performance. We used
Saucy version 3 for this purpose [25]. After the addition of the clauses with representative variables, the
number of symmetries is drastically reduced. However, one can generate symmetry-breaking predicates
for $F_{\mathrm{der}}(G,t)$ and add those instead. Although this is helpful in some cases, the average speed-up was
only between 5 and 10%.

Our experimental computations are ongoing. Below we report on some of the results we have obtained 
so far. 

6.1 Random Graphs 

The asymptotics of the clique-width of random graphs have been studied by Lee et al. [27]. Their
results show that for random graphs on $n$ vertices the following holds asymptotically almost surely: If
the graphs are very sparse, with an edge probability below $1/n$, then the clique-width is at most 5; if the
edge probability is larger than $1/n$, then the clique-width grows at least linearly in $n$. Our first group of
experiments complements these asymptotic results and provides a detailed picture of the clique-width
of small random graphs. We have used the SAT encoding to compute the clique-width of graphs with
10, 15, and 20 vertices, with the edge probability ranging from 0 to 1. A plot of the distribution is
displayed in Figure 2. It is interesting to observe the symmetry at edge probability 1/2, and how
the steepness of the curve increases with the number of vertices. Note the "shoulders" of the curve for
very sparse and very dense graphs.

Figure 2: Average clique-width of random graphs with edge probabilities between 0 and 1. Each dot in
the graph represents the average clique-width of 100 graphs.



6.2 The Clique-Width Numbers

For every $k > 0$, let $n_k$ denote the smallest number such that there exists a graph with $n_k$ vertices
that has clique-width $k$. We call $n_k$ the $k$th clique-width number. From the characterizations known
for graphs of clique-width 1, 2, and 3, respectively 0, it is easy to determine the first three clique-width
numbers (1, 2, and 4). However, determining $n_4$ is not straightforward, as it requires nontrivial
arguments to establish clique-width lower bounds. We would like to point out that a similar sequence
for the graph invariant treewidth is easy to determine, as the complete graph on $n$ vertices is the smallest
graph of treewidth $n - 1$. One of the very few known graph classes of unbounded clique-width for which
the exact clique-width can be determined in polynomial time is the class of grids [23]; the $k \times k$ grid with $k \ge 3$ has
clique-width $k + 1$ [20]. Hence grids provide the upper bounds $n_4 \le 9$, $n_5 \le 16$, $n_6 \le 25$, and $n_7 \le 36$.
With our experiments we could determine $n_4 = 6$, $n_5 = 8$, $n_6 = 10$, $n_7 = 11$, $n_8 \le 12$, and $n_9 \le 13$.
It is known that the path on four vertices ($P_4$) is the unique smallest graph in terms of the number
of vertices with clique-width 3. We could determine that the triangular prism (3-Prism) is the unique
smallest graph with clique-width 4, and that there are exactly 7 smallest graphs with clique-width 5.
There are 68 smallest graphs with clique-width 6 and one of them has only 18 edges. See Figure 3 for an
illustration. Additionally, we found several graphs of size 11 with clique-width 7 by extending a graph
of size 10 with clique-width 6.






Figure 3: Smallest graphs with clique-width 3, 4, 5, and 6 (from left to right). 

Proposition 3. The clique-width sequence starts with the numbers 1, 2, 4, 6, 8, 10, 11. 

We used Brendan McKay's software package Nauty [28] to avoid checking isomorphic copies of the
same graph. There are several other preprocessing methods that can speed up the search for small
graphs of prescribed clique-width $k > 2$. Obviously, we can limit the search to connected graphs, as the
clique-width of a graph is clearly the maximum clique-width of its connected components. We can also
ignore graphs that contain twins (two vertices that have exactly the same neighbors), as we can delete
one of them without changing the clique-width. Similarly, we can ignore graphs with a universal vertex,
a vertex that is adjacent to all other vertices, as it can be deleted without changing the clique-width. All
these filtering steps are subsumed by the general concept of prime graphs. Consider a graph $G = (V,E)$.
A vertex $u \in V$ distinguishes vertices $v, w \in V$ if $uv \in E$ and $uw \notin E$. A set $M \subseteq V$ is a module if
no vertex from $V \setminus M$ distinguishes two vertices from $M$. A module $M$ is trivial if $|M| \in \{0, 1, |V|\}$.
A graph is prime if it contains only trivial modules. It is well-known that the clique-width of a graph
is either 2 or the maximum clique-width of its induced prime subgraphs [9]. Hence, in our search, we
can ignore all graphs that are not prime. We can efficiently check whether a graph is prime [21]. The
larger the number of vertices, the larger the fraction of prime graphs (considering connected graphs
modulo isomorphism). Table 2 gives detailed results.
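The definitions of module and prime graph translate directly into a brute-force test. The Python sketch below is only meant to make the definitions concrete. It enumerates all vertex subsets and therefore takes exponential time, so it is not the efficient procedure referred to in the text. The function names are assumptions.

```python
from itertools import combinations

def is_module(module, vertex_set, adj):
    """True iff no vertex outside `module` distinguishes two of its vertices."""
    for u in vertex_set - module:
        hits = {v in adj[u] for v in module}
        if len(hits) == 2:          # u is adjacent to some, but not all, of the module
            return False
    return True

def is_prime(vertices, edges):
    """True iff the graph has only trivial modules (sizes 0, 1, and |V|)."""
    vertex_set = set(vertices)
    adj = {v: set() for v in vertex_set}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    for size in range(2, len(vertex_set)):
        for candidate in combinations(sorted(vertex_set), size):
            if is_module(set(candidate), vertex_set, adj):
                return False        # found a nontrivial module
    return True
```

For instance, the path on four vertices is prime, while the cycle on four vertices is not, because two opposite vertices are twins and hence form a nontrivial module.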

Table 2: Number of connected and prime graphs with specified clique-width, modulo isomorphism.

                                         clique-width
 |V|    connected        prime          3          4          5     6
 ...
   9      261,080      145,870     16,348    125,364      4,158
  10   11,716,571    8,110,354    142,745  5,520,350  2,447,190    68



6.3 Famous Named Graphs 

The graph theoretic literature contains several graphs that have names, sometimes inspired by the
graph's topology, and sometimes after their discoverer. We have computed the clique-width of several
named graphs; the results are given in Table 3 (definitions of all considered graphs can be found in
MathWorld [37]). The Paley graphs, named after the English mathematician Raymond Paley (1907-1933),
stick out as having large clique-width. Our results on the clique-width of Paley graphs imply
some upper bounds on the 9th and 11th clique-width numbers: $n_9 \le 13$ and $n_{11} \le 17$.

7 Conclusion 

We have presented a SAT approach to the exact computation of clique-width, based on a reformulation
of clique-width and several techniques to speed up the search. This new approach allowed us to
systematically compute the exact clique-width of various small graphs. We think that our results could be of
relevance for theoretical investigations. For instance, knowing small vertex-minimal graphs of certain
clique-width could be helpful for the design of discrete algorithms that recognize graphs of bounded
clique-width. Such graphs can also be useful as gadgets for a reduction to show that the recognition
of graphs of clique-width 4 is NP-hard, which is still a long-standing open problem [16]. Furthermore,
as discussed in Section 1, there are no heuristic algorithms to compute the clique-width directly, but
heuristic algorithms for related parameters can be used to obtain upper bounds on the clique-width.
Our SAT-based approach can be used to empirically evaluate how far heuristics are from the optimum,
at least for small and medium-sized graphs.

So far we have focused in our experiments on the exact clique-width, but for various applications it is
sufficient to have a good upper bound. Our results (see Table 1) suggest that our approach can be scaled
to medium-sized graphs for the computation of upper bounds. We also observed that for many graphs
the upper bound of Lemma 5 is not tight. Thus, we expect that if we search for shorter derivations,
which is significantly faster, this will yield optimal or close to optimal solutions in many cases.

Finally, we would like to mention that our SAT-based approach is very flexible and open. It can
easily be adapted to variants of clique-width, such as linear clique-width [551 [inj , m-clique-width [T^] ,
or NLC-width [36]. Hence, our approach can be used for an empirical comparison of these parameters.




Table 3: Clique-width of named graphs. Sizes are reported for the unsatisfiable formulas.

 graph          |V|  |E|  cwd  variables  clauses     UNSAT    SAT
 Brinkmann       21   42   10      8,526  163,065  3,932.56   1.79
 Chvátal         12   24    5      1,800   21,510      0.40   0.09
 Clebsch         16   40    8      3,872   60,520    191.02   0.09
 Desargues       20   30    8      7,800  141,410  3,163.70   0.26
 Dodecahedron    20   30    8      7,800  141,410  5,310.07   0.33
 Errera          17   45    8      4,692   79,311     82.17   0.16
 Flower snark    20   30    7      8,000  148,620    276.24   3.9
 Folkman         20   40    5      8,280  168,190     11.67   0.36
 Franklin        12   18    4      1,848   21,798      0.07   0.04
 Frucht          12   18    5      1,800   20,223      0.39   0.02
 Hoffman         16   32    6      4,160   64,968      8.95   0.46
 Kittell         23   63    8     12,006  281,310    179.62  18.65
 McGee           24   36    8     13,680  303,660  8,700.94  59.89
 Sousselier      16   27    6      4,160   63,564      3.67  11.75
 Paley-13        13   39    9      1,820   22,776     12.73   0.05
 Paley-17        17   68   11      3,978   72,896    194.38   0.12
 Pappus          18   27    8      5,616   90,315    983.67   0.14
 Petersen        10   15    5      1,040    9,550      0.10   0.02
 Poussin         15   39    7      3,300   50,145      9.00   0.21
 Robertson       19   38    9      6,422  112,461    478.83   0.76
 Shrikhande      16   48    9      3,680   59,688    129.75   0.11